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Abstract 

Implicit numerical integration of nonlinear ODEs requires solving a system of nonlinear equations at 
each time step. Each of these systems is often solved by a Newton-like method, which incurs a sequence 
of linear-system solves. Most model-reduction techniques for nonlinear ODEs exploit knowledge of 
system's spatial behavior to reduce the computational complexity of each linear-system solve. However, 
the number of linear-system solves for the reduced-order simulation often remains roughly the same 
as that for the full-order simulation. 

We propose exploiting knowledge of the model's temporal behavior to 1) forecast the unknown 
variable of the reduced-order system of nonlinear equations at future time steps, and 2) use this 
forecast as an initial guess for the Newton-like solver during the rcduccd-ordcr-modcl simulation. To 
compute the forecast, we propose using the Gappy POD technique. The goal is to generate an accurate 
initial guess so that the Newton solver requires many fewer iterations to converge, thereby significantly 
decreasing the number of linear-system solves in the reduced-order-model simulation. 

Keywords: nonlinear model reduction, Gappy POD, temporal correlation, forecasting, initial guess 



1. Introduction 

High-fidelity physics-based numerical simulation has become an indispensable engineering tool 
across a wide range of disciplines. Unfortunately, such simulations often bear an extremely large com- 
putational cost due to the large-scale, nonlinear nature of high-fidelity models. When an implicit time 
integrator is employed to advance the solution in time (as is often essential, e.g., for stiff problems) this 
large cost arises from the need to solve a sequence of high-dimensional systems of nonlinear algebraic 
equations — one at each time step. As a result, individual simulations can take weeks or months to com- 
plete, even when high-performance computing resources are available. This renders such simulations 
impractical for time-critical and many-query applications. In particular, uncertainty-quantification 
applications (e.g., Bayesian inference problems) call for hundreds or thousands of simulations (i.e., 
forward solves) to be completed in days or weeks; in-the- field analysis (e.g., guidance in-field data 
acquisition) requires near-real-time simulation. 

Projection-based nonlinear model-reduction techniques have been successfully applied to decrease 
the computational cost of high-fidelity simulation while retaining high levels of accuracy. To accomplish 
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this, these methods exploit knowledge of the system's dominant spatial behavior — as observed during 
'training simulations' conducted a priori — to decrease the simulation's spatial complexity, which we 
define as the computational cost of each linear-system solveQ To do so, these methods 1) decrease the 
dimension of the linear systems by projection, and 2) approximate vector-valued nonlinear functions 
by sampling methods that compute only a few of the vector's entries (e.g., empirical interpolation 
[TJ |2], Gappy POD [5]). However, these techniques are often insufficient to adequately reduce the 
computational cost of the simulation. For example, Ref. [3] presented results for the GNAT nonlinear 
model-reduction technique applied to a large-scale nonlinear turbulent-flow problem. The reduced- 
order model generated solutions with sub-1% errors and reduced the spatial complexity by a factor 
of 637. However, the total number of linear-system solves required for the reduced-order-model simu- 
lation, which we define as the temporal complexity, remained large. In fact, the temporal complexity 
was decreased by a factor of only 1.5. As a result, the total computing resources (computing cores 
x wall time) required for the simulation were decreased by a factor of 438, but the wall time was 
reduced by a factor of merely 6.9. While these results are promising (especially in their ability to 
reduce spatial complexity), the time integration of nonlinear dynamics remains problematic and often 
precludes real-time performance. 

The goal of this work is exploit knowledge of the system's temporal behavior as observed during 
the training simulations to decrease the temporal complexity of (deterministic) reduced-order-modcl 
simulations. For this purpose, we first briefly review methods that exploit observed temporal behavior 
to improve computational performance. 

Temporal forecasting techniques have been investigated for many years with a specific focus on re- 
ducing wall time in a stable manner with maximal accuracy. The associated body of work is large and 
a comprehensive review is beyond the scope of this paper. However, this work focuses on time integra- 
tion for reduced-order models plagued by highly nonlinear dynamics; several categories of specialized 
research efforts provide an appropriate context for this research. 

At the most fundamental level of temporal forecasting, a variety of statistical time-series-analysis 
methods exist that exploit 1) knowledge of the temporal structure, e.g., smoothness, of a model's 
variables, and 2) previous values these variables for the current time series or trajectory. The connection 
between these methods and our work is that such forecasts can serve as an initial guess for an iterative 
solver (e.g., Newton's method) at an advanced point in time. However, the disconnect between such 
methods and the present context is that randomness and uncertainty drive time-series analysis; as such, 
these forecasting methods are stochastic in nature (see Refs. [5] [8] [101 HH H2]). In addition, 
the majority of time-series analyses have been applied to application domains (e.g., economics) with 
dynamics that are not generally modeled using partial differential equations. Finally, such forecasting 
techniques to not exploit a collection of observed, complete time histories from training experiments 
conducted a priori. Because such training simulations lend important insight into the spatial and 
temporal behavior of the model, we are interested in a technique that can exploit these data. 

As fundamental as time-series analysis, time integrators for ordinary differential equations (ODEs) 
employ Taylor-series expansions to provide reasonably accurate forecasts of the state or the unknown 
at each time step. Time integrators employ such a forecast for two purposes. First, algorithms with 
adaptive time steps employ interpolation to obtain solutions (and their time derivatives) at arbitrary 
points in time. Implicit time integrators for nonlinear ODEs, which require the iterative solution to 
nonlinear algebraic systems at each time step, use past history (of the current trajectory) to forecast 
an accurate guess of the unknown in the algebraic system, e.g., Ref. |13j . Again, forecasting by 
Taylor-series expansion makes no use of the temporal behavior observed during training simulations. 

Closely connected to time integration but specialized to leverage developments in high-performance 



1 A sequence of linear systems arises at each time step when a Newton-like method is employed to solve the system 
of nonlinear algebraic equations. 
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computing, time parallel methods can offer computational speedup when integrating ODEs. Dating 
back to before the general availability of parallel computers, researchers speculated about the benefits 
of decomposing the temporal domain across multiple processors [14] . Advancements have been made 
from parallel multigrid to parareal techniques [151 116L I17L I18j . Although time-domain decomposition 
algorithms have demonstrated speedup, they are limited in comparison to the spatial domain decom- 
position methods and they require a careful balance between stability and computational efficiency 
[TT)] . It is possible that these methods could further improve performance in a model-reduction setting 
[20 (and could complement the method proposed in this work), but near real-time performance is 
likely unachievable through time-parallel methods alone. 

To some extent, exploiting temporal behavior has been explored in nonlinear model reduction. 
Bos et al. |21j proposed a reduced-order model in the context of explicit time integration wherein the 
generalized coordinates are computed based on a best-linear-unbiased (BLU) estimate approach. Here, 
the reduced state coordinates at time step n + 1 are computed using empirically derived correlations 
between the reduced state coordinates and 1) their value at the previous time step, 2) the forcing input 
at the previous time step, and 3) a subset of the full-order state. However, the errors incurred by this 
time-integration procedure (compared with standard time integration of the reduced-order model) are 
not assessed or controlled. This can be problematic in realistic scenarios, where error estimators and 
bounds are essential. Another class of techniques called a priori model reduction methods [22, 23 build 
a reduced-order model 'on the fly', i.e., over the course of a given time integration. These techniques 
try to use the reduced-order model at as many time steps as possible; they use the high-fidelity model 
when the reduced-order model is deemed to be inaccurate. So, these techniques employ the reduced- 
order model as a tool to accelerate the high-fidelity- model simulation. In contrast, this work aims to 
accelerate the reduced-order-model simulation itself. Further, these methods differ from the present 
context in that there are no training experiments conducted a priori from which to glean information 
about the model's temporal behavior. 

In this work, we propose a method that exploits a set of complete trajectories observed during 
training simulations to decrease the temporal complexity of a reduced-order-model simulation. The 
method 1) forecasts the unknown variable in the reduced-order system of nonlinear equations, and 2) 
uses this forecast as an initial guess for the Newton-like solver. To compute the forecast, the method 
employs the Gappy POD method [3J, which extrapolates the unknown variable at future time steps 
by exploiting 1) the unknown variable for the previous a time steps (where a is the memory of the 
process), and 2) a database of time histories of the unknown variable. If the forecast is accurate, then 
the Newton-like solver will require very few iterations to converge, thereby decreasing the number 
of linear-system solves in the simulation. The method is straightforward to implement: the training 
stage simply requires collecting an additional set of snapshots during the training simulations. The 
reduced-order-model simulation simply requires an external routine for determining the initial guess 
for the Newton-like solver. 



2. Problem formulation 

This section provides the context for this work. Section |2~T] describes the class of full-order models 
we consider, which includes first- and second-order ODEs numerically solved by implicit time integra- 



tion. Section 2.2 describes the reduced-order modeling strategies for which the proposed technique is 
applicable. 
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2.1. Full-order model 



2.1.1. First- and second- order ODEs 

First, consider the parameterized nonlinear first-order ODE corresponding to the full-order model 
of a dynamical system: 

x = f(x;t,p(t),q) (1) 
x(0,p,q)=x°(q). (2) 

Here, time is denoted by t £ [0, T], the time-dependent forcing inputs are denoted by p : [0, T] — > K p , 
the time-independent parametric inputs are denoted by q £ K q , and / : x [0, T] xl p xl q -) R N 
is nonlinear in at least its first argument. The state is denoted by x = x(t,p, q) £ with N denoting 
the number of degrees of freedom in the model. The parameterized initial condition is x° : MP — > M. N . 

Because this work handles both first- and second-order ODEs, consider also the parameterized 
nonlinear second-order ODE corresponding to the full-order model of a dynamical system: 

x = g(x,x;t,p(t) ,q) (3) 
x(0,p,q)=x°(q) (4) 
x(0,p,q) = v°(q). (5) 

Here, the function g : R N x R N x [0, T] x W x R q — > R N is nonlinear in at least its first two arguments, 
and the parameterized initial velocity is denoted by v° : M. p — > R^j^J 

2.1.2. Implicit time integration 

Given forcing and parametric inputs, the numerical solution to the full-order model described by 
Q-© or |5J-@ can be computed via numerical integration. For systems exhibiting stiffness, an 
implicit integration method is often the most computationally efficient choice; it is even essential 
in many cases |24j . When an implicit time integrator is employed, s coupled systems of nonlinear 
equations are solved at each time step n = 1, . . . M: 

R?(w n > 1 > ... > w n ' s ;p,q)=0, i=l,...,s. (6) 

Here, the function i?" : R^ x • • • x R N xK p xK'-> R N is nonlinear in at least its firs t s arguments and 
the unknowns w 11 ' 1 £ R N , i = 1, . . . , s are implicitly defined by (|6]). As discussed in Appendix A and 



Appendix B[ the unknowns w n,t represent the state, velocity, or acceleration at points t n 1 + Cih n 



where c% £ [0, 1], i = 1, . . . , s is defined by the time integrator: 

w n,i = w ».ifa q ) eee wit"- 1 + ah n ;p,q). (7) 

n 

So, a superscript n denotes the value of a quantity at time t n = h k , a superscript n, i denotes the 

fc=i 

n-i 

value of a quantity at time t n ' % = ^ h k + Cih n , and h denotes the time-step size. 

k=l 

After the unknowns are computed by solving Eq. ([6]), the state is explicitly updated as 

s 

x n = 1 x' 1 - 1 + ^5 t w n >\ (8) 



2 Note that an JV-dimensional second-order ODE can be rewritten as 27V-dimensional first-order ODE. 
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where 7 and Si, i = 1, . . . , s are scalars defined by the integrator. For second-order ODEs, the velocity 
is also updated explicitly as 



ex n -' +Yj iW n '\ (9) 



where e and £j, i = 1, . . . , s are also scalars defined by the integrator. Appendix A and Appendix B 



specify the form of Eqs. (|6])-(|9]) for important classes of implicit numerical integrators for first- and 
second-order ODEs, respectively. 

The chief computational burden of solving ([!]) with an implicit integrator lies in solving nonlinear 
equations ^ at each time step; this is typically done with a Newton-like method. In particular, if K 
denotes the average number of Newton-like iterations required to solve ([6]) , then the full-order-model 
simulation requires solving KM linear systems of dimension sA^j^j We denote the simulation's spatial 
complexity to be the computational cost of solving each linear system; we consider the simulation's 
temporal complexity to be the total number of linear-system solves. 

The spatial complexity contributes significantly to the computational burden for large-scale systems 
because N is large. However, the temporal complexity is also significant for such problems. First, the 
number of total time steps M is often proportional to N . This occurs because refining the mesh in space 
often necessitates a decrease in the time-step size to balance the spatial and temporal errors]^] Second, 
the average number of Newton-like iterations K can be large when the problem is highly nonlinear 
and large time steps are taken, which is common for implicit integrators. Under these conditions, 
the initial guess for the Newton solver, which is often taken to be a polynomial extrapolation of the 
unknown, can be far from the true value of the unknowns. 

In many cases (e.g., linear multi-step methods, single-stage Runge-Kutta schemes), s = 1. For this 
reason, and for the sake of notational clarity, the rest of this paper assumes s = 1, and w n designates 
the value of the unknown variable at time i"' . However, we note that the proposed technique can be 
straightforwardly extended to s > 1. 



2.2. Reduced-order model 

Nonlinear model-reduction techniques aim to generate a low-dimensional model that is inexpensive 
to evaluate, yet captures key features of the full-order model. To do so, these methods first perform 
analyses of the full-order model for a set of ritrain training parametric and forcing inputs D tra i n = 
{(p k : Q k )}k=i n during a computationally intensive 'offline' training stage. These analyses may include 
integrating the equations of motion, modal decomposition, etc. 

Then, the data generated during these analyses are employed to decrease the the cost of each linear- 
system solve via two approximations: 1) dimension reduction, 2) nonlinear-function approximation 
(spatial-complexity reduction). Once these approximations are defined, the resulting reduced-order 
model is employed to perform computationally inexpensive analyses for any inputs (p, q) ^ Strain 
during the 'online' stage. 



3 Assuming the Jacobian of the residual is sparse with an average number of nonzeros per row ui -C N, the dominant 
computational cost of solving Eqs. lj6| for the entire simulation is O (u 2 sN KM^j if a direct linear solver is used. It 
is O (LuisN K M) if an iterative linear solver is used. Here, L denotes the average number of matrix-vector products 
required to solve each linear system in the case of an iterative linear solver. 

4 This is not necessarily true for explicit time-integration schemes, when the time-step size is limited by stability 
rather than accuracy. In this case, Krysl et al. 1251 showed that employing a low-dimensional subspace for the state 
improves stability and therefore permits a larger time-step size. As a result, the reduced-order state equations can be 
solved fewer times than the full-order state equations. 
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2.2.1. Dimension reduction 

Model-reduction techniques decrease the number ol degrees of freedom by computing an approxi- 
mate state x ss x that lies in an affine trial subspace of dimension N <C N: 



x(t,P, q) = x° (q) + $x(t,p, q) 
x(t,p,q) = $x(t,p,q) 
x(t,p,q) = ®x(t,p,q). 



(10) 

(11) 
(12) 



Here, the trial basis (in matrix form) is denoted by $ = [<j>i ■ ■ ■ (p^\ G M. NxN with <J> T $ = /. The 

generalized state is denoted by x = \xi ■ ■ ■ x^ T E R^. When the unknown variable computed at 
each time step (see Section 2.1.2) corresponds to the state, velocity, or acceleration, we can express it 
as 

w(t,p,q) = w°(q) + $w(t,p,q), (13) 



where w = [i 



W N1 



cJY 



denotes the vector of generalized unknowns. 



Substituting Eqs. @-([II]) into ([l] yields 

<Z>X = f(x°(q) + $x;t,p(t),q) 
Alternatively, substituting Eq. ( 10 )-( 12 ) into ^ yields 

<$>x = g{x° (q) + $x,<f>x;t,p(t),q) . 



(14) 



(15) 



The overdetermined ODEs described by (14) and (15) may not be solvable, because image(/) <f_ 
range($) and image(g) 2! range($) in general. Several methods exist to compute a solution. 

Project, then discretize in time. This class of model-reduction methods first carries out a projection 
process on the ODE followed by a time-integration of the resulting low-dimensional ODE. The (Petrov- 
Galerkin) projection process enforces orthogonality of the residual corresponding to overdetermined 
ODE ( [T4| or ( 15 ) to an TV-dimensional test subspace range (*), with * G R NxN . For first-order ODEs, 
this leads to 



x = (* T $) 1 -* T f (x° (q) + <S>x;t,p (t) , q) . 
For second-order ODEs, the result is 

I = (* T $) ~* ^ T g (x° (q) + $£, $i; t, p (t) , . 
Galerkin projection corresponds to the case where \& = <f>. 



(16) 



(17) 



Because Eq. (161 (resp. (17)) is an ODE of the same form as (Tj) (resp. ([3])), it can be solved using 
the same numerical integrator that was used to solve ([I]) (resp. (|3)). Further, the same time-step sizes 
are often employed, as the time-step size is determined by accuracy (not stability) for implicit time 
integrators. For both first- and second-order ODEs, this again leads to a system of nonlinear equations 
to be solved at each time step n = 1, . . . , M: 



(* T $) 1 * T i?" (w° (q) + $w n ;p,q) = 0. 



(18) 



The unknown w n can be computed by applying Newton's method to (18). Then, the explicit updates 
([8])-([9]) can proceed as usual. 
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Discretize in time, then project. This class of model-reduction techniques first applies the same nu- 
merical integrator that was used to solve ([I]) to the overdetermined ODE (14 1 or (15 1. However, the 



resulting algebraic system of N nonlinear equations in N unknowns remains overdetermined: 

R n (w°(q) + <S>w n ;p,q) = 0. (19) 



To compute a unique solution to ( |19[ ), orthogonality of the discrete residual R n to a test subspace 
range ('J') can be enforced. However, this leads to a reduced system of nonlinear equations equivalent 



to (18 1. So, in this case, the two classes of model-reduction techniques are equivalent. 



On the other hand, to compute a unique solution to (19), the discrete-residual norm can be mini- 
mized [23 HI [37J HE1 HH] i which ensures discrete optimality~$\ : 

W n = arg min \\R n (w° (q) + <5>y; Pl q) |||. (20) 

The unknown w n can be computed by applying a Newton-like nonlinear least-squares method (e.g., 
Gauss-Newton, Levenberg-Marquardt) to (20 1. Again, explicit updates ([8])-([9]) can proceed after the 
unknowns are computed. 

2.2.2. Spatial- complexity reduction 

For nonlinear dynamical systems, the dimension reduction described in Section [2.2.1| is insufficient 
to guarantee a reduction in the computational cost of each linear-system solve. The reason is that 
the full-order residual depends on the state, so it must be recomputed and subsequently projected or 
minimized at each Newton-like iteration. 

For this reason, nonlinear model-reduction techniques employ a procedure to reduce the spatial- 
complexity, i.e., decrease the computational cost of computing and projecting or minimizing the nonlin- 
ear residual]^] In particular, the class of 'function sampling' techniques replace the full-order nonlinear 
residual with an approximation R w R that is inexpensive to compute. Then, R n R n is employed 



in (18 1 or ( |20[ ) to compute the unknowns w n . 

Methods in this class can be categorized as follows: 

1. Collocation approaches. These methods employ a residual approximation that sets many of the 
residual's entries to zero: 

R n = Z T ZR n . (21) 

Here, Z is a restriction matrix consisting of selected rows of Inxn- This approach has been 
developed for Galerkin projection [301 H2] and discrete-residual minimization [25] . 

2. Function-reconstruction approaches. These methods employ a residual approximation that com- 
putes a few entries of the residual or nonlinear function, and subsequently 'fills in' the remaining 
entries via interpolation or least-squares regression. That is, these methods apply one of the 
following approximations: 

R n = (Z$ fl ) + ZR n (22) 

f=$ f (Z*f)+Zf (23) 

g = $ 9 (Z<D S )+ Zg. (24) 

Here, $/, and $ g are empirically derived bases used to approximate the nonlinear residual, 
velocity, and acceleration, respectively]^] A superscript + denotes the Moore-Penrose pseu- 
doinverse. This approach has been developed for Galerkin projection [301 HH 12 Ell 132] and 
discrete-residual minimization 26, [4]. 



5 Such techniques are occasionally referred to as 'hyper-reduction' techniques |22| . 

6 When the bases are computed via POD, this technique is known as Gappy POD [3]. 
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3. Temporal-complexity reduction 



While the model-reduction approaches described in the previous section decrease the computational 
cost of each linear-system solve (i.e., spatial complexity), they do not necessarily decrease the number 
of linear-system solves (i.e., temporal complexity). The goal of this work is devise a method that 
decreases this temporal complexity while introducing no additional error. 



3.1. Method overview 

The main idea of the proposed approach is to compute an accurate forecast of the generalized 
unknowns at future time steps using the Gappy POD procedure, and employ this forecast as an initial 
guess for the Newton-like solver at future time steps. 

Gappy POD is a technique to reconstruct vector-valued data that has 'gaps,' i.e., entries with 
unknown or uncomputed values. Mathematically, the approach is equivalent to least-squares regression 
in one discrete-valued variable using empirically computed basis functions. It was introduced by 
Everson and Sirovich [3] for the purpose of image reconstruction. It has also been used for static 
[33] [34] and time-dependent [35j [36] flow field reconstruction, inverse design [34], design variable 
mapping for multi-fidelity optimization [37 , and for decreasing the spatial complexity in nonlinear 
model reduction [SUl [ZD HS1 H] • This work proposes a novel application of Gappy POD: as a method 
for forecasting the generalized unknown at future time steps during a reduced-order-model simulation. 

During the offline stage, the proposed method computes a 'time-evolution basis' for each general- 
ized unknown Wj, j = 1,...,N. Each basis represents the time-evolution of a generalized unknown 



as observed during training simulations. Figure 1(a) depicts this idea graphically, and Section 3.2 



describes a computationally inexpensive way to compute these bases. 




time time 

(a) Offline: the computed time-evolution POD basis (b) Online: time steps taken so far (red), recent time 
for a generalized unknown. steps used to compute forecast (green), forecast (blue) 



Figure 1: Graphical depiction of the proposed method 



During the online stage, the method computes a forecast of the generalized unknowns at future 
time steps via Gappy POD. This forecast employs 1) the time-evolution bases and 2) the generalized 
unknowns computed at several previous time steps. Figure 1(b) depicts this, and Section 3.3 describes 
the forecasting method in detail. At future time steps, this forecast is employed as an initial guess 
for the Newton-like solver. If the forecast is accurate, the Newton-like solver will converge in very few 
iterations; if it is inaccurate, the Newton-like solver will require more iterations for convergence. Note 
that the accuracy of the solution is not hampered in either case (assuming a globalization strategy is 
employed). If the number of Newton steps required for convergence is large, this indicates an inaccurate 
initial guess. When this occurs, the method computes a new forecast. 

The proposed method is expected to be effective if the temporal behavior of the generalized un- 
knowns is similar across input variation. The method is independent of the dimension-reduction or 
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spatial-complexity-reduction scheme employed by the reduced-order model; further, the method is ap- 
plicable (without modification) to both first- and second-order ODEs. The next sections describe the 
offline and online steps of the methodology in detail. 

3.2. Offline stage: compute the time-evolution bases 

The objective of the offline stage is to compute the time-evolution bases that will be used for the 
online forecast. Ideally, the bases should be able to describe the time evolution of the generalized state 
for any forcing inputs p and parametric inputs q. If the bases are 'bad', then the forecasting step of 
the algorithm will be inaccurate, and there may be no reduction in the average number of Newton-like 
iterations. 

We propose employing a POD basis for the time evolution of the generalized unknown. This basis 
is computed a priori during 'offline' simulations of the reduced-order model in three steps: 

1. Collect snapshots of the unknown during each of the n tra in training simulations: 

Y k = [w° (p fc) q k ) ■■■ w M - X (p k , q k )] (25) 

for k = 1, . . . , ntrain, with Y k e R NxM . Here, p k e R p denotes the forcing inputs for training 
simulation k, and q k G M. q denotes the parametric inputs for training simulation k. 
In some cases, the snapshot matrices Y k , k = 1, . . . , n tra in are already available from computing 
the trial basis $. This occurs, for example, when proper orthogonal decomposition (POD) is 
employed to compute $ and the time integrator's unknown is the state vector. 

2. Compute the corresponding snapshots of the generalized unknown: 

% = <f> T [Y k - w" (q k ) 1 T ] (26) 
= [w°(p k ,q k ) ■■■ w^ip^qk)] (27) 

for k — I, ntrain, where orthogonality of the trial basis $ T $ = / has been used. Here, 
Y k e R^xM and 1 £ m m denotes a vector of ones. 

3. Compute the time-evolution bases via the singular value decomposition. Defining Y k = yi ik ■ ■ ■ k 

where y^ k e M. M can be interpreted as a snapshot of the time evolution of the jth generalized 
unknown Wj during training simulation k, this step amounts to 

' ' ' hn^J = U^Vj (28) 
Sj = [uj,i ■■■ u j>aj ] , (29) 

for j = 1, . . . , N. Here, Uj = [u jA ■ ■ ■ Uj> train ] G R Mx "trai„ anc j a . < ntrain . 

After the time-evolution bases € R Mxaj , j = I, ... ,7V have been determined during the offline 
stage, they can be used to accelerate online computations via forecasting. The next section describes 
this. 



T 



3.3. Online stage: forecast 

During the online stage, the method employs a forecasting procedure to define the initial guess for 
the Newton-like solver. To compute this forecast, it uses the time evolution bases (computed offline), 
and the values of the generalized unknown at the previous a time steps (computed online). Here, a is 
considered the 'memory' of the process. When the number of Newton iterations exceeds a threshold 
value r, this indicates a poor forecast; in this case, the forecast is recomputed. If the forecast is 
accurate, then the number of iterations needed to converge from the (improved) initial guess will be 
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Algorithm 1 Online: temporal-complexity-reduction method 



Input: Time-evolution bases Sj € M. Mxaj , j = 1, . . . , N; maximum memory a max with 
a max > maxoj-; Newton-step threshold r 

3 

Output: Generalized state at all M time steps: x n , n = 1, . . . , M 



for n = 1, . . . , M do 

if forecast + C\h n ) is available then 

Set initial guess for Newton solver to li)™^ = w^V 1 ^ 1 + cih n ), j = 1, . . . , N 
else 

Use typical initial guess for Newton solver (e.g., polynomial extrapolation of unknown) 
end if 



Solve reduced-order equations ( 18 1 or |20|) with a Newton-like method and specified initial guess. 
The number of Newton-like iterations required for convergence is denoted by K 
8: if K > r and (n — 1) > maxaj then {recompute forecast} 

3 

9: Set memory a <— min(n — 1, a max ) 

10: Compute forecasting coefficients zj, j = 1, . . . ,N using the unknown at the previous a time 



steps (see Eq. ( 30 )) 

Set forecast to be w^- = SjZj and define Wj = hr 1 (Wj) , j = 1, . . . , N 
end if 
end for 



drastically reduced, thereby decreasing K and hence the temporal complexity. Algorithm [T] outlines 
the proposed technique. 

To compute the forecasting coefficients in step 10 of Algorithm [l] we propose using the Gappy 



POD approach of Everson and Sirovich [3]. This approach computes coefficients Zj via the following 
linear least-squares problem: 

Zj — arg min \\Z(n,a)EjZ — Z(n,a)h(wj)\\ (30) 

Here, the matrix Z(n,a) € R qxM is the restriction matrix that selects entries corresponding to the 
previous a time steps: 

Z(n, a) = [e n - a -i ■■■ e„_i] T , (31) 
where e.; denotes the ith canonical unit vector. Note that a > an is required for there to be a unique 



h : x i— > x with x = [xi ■ ■ • xa/] T G 



solution to (30). The function h in (30) 'unrolls' time according to the time discretization; we define 



x n = x(t n - 1 + Cl h n ), n=l,...,M. (32) 



4. Numerical experiments 



These numerical experiments assess the performance of the proposed temporal-complexity-reduction 
method on a structural-dynamics example using three reduced-order models: Galerkin projection (Eq. 



(17) with = $), Galerkin projection with collocation (Eq. (|2l[)), and Galerkin projection with 



least-squares reconstruction of the residual (Eq. (22)). We consider a sequence of problems that pose 
increasing difficulty to the method. 



Section |4.2| considers the ideal scenario for the method: the online inputs are identical to the 
training inputs, and the reduced bases are not truncated. In this case, the temporal behavior of the 
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system is exactly predictable, because (in exact arithmetic) the online response is the same as the 
training response. Therefore, we expect the proposed method to work extremely well in this case. 

Section 4.3 assesses the method's performance in a more challenging setting. Here, the online 
inputs are different from the training inputs (i.e., predictive scenario), so the temporal behavior is 
not identical to that observed during the training simulations. The parametric inputs correspond 
to material properties and shape parameters, and the external force is set to zero; this leads to a 
free- vibration problem. As a result, the dynamics encountered in this example are relatively smooth. 

Section [4~4| considers a more challenging predictive scenario wherein rich dynamics — generated from 
a high-frequency external force — characterize the response. Here, the parametric inputs correspond to 
the magnitudes of the high-frequency forces. All material-property and shape variables are held fixed. 

Finally, Section |4.5| pushes the method to its limit by considering a predictive scenario characterized 
by high-frequency external forces as well as variations in material properties and shape variables. 



4-.1- Problem setup 

Figure 2(a)| depicts the parameterized, non-conservative clamped-free truss structure, where the 
arrows indicate the initial displacement of magnitude c. The full-order model is constructed by the 





(a) Geometric parameters, initial displacement (red) (b) External force 1 (blue), 2 (green), and 3 (red) 



Figure 2: Clamped-free parameterized truss structure 



finite-element method. It consists of sixteen three-dimensional bar elements per bay with three degrees 
of freedom per node; this results in 12 degrees of freedom per bay. We consider a problem with 750 
bays, which leads to 9 x 10 3 degrees of freedom in the full-order model. The bar elements model 
geometric nonlinearity, which results in a high-order nonlinearity in the strain energy. Each bay has a 
(unitlcss) depth of 1 and a cross-sectional area determined by the parameterized geometry. 
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The parameters G [— 1, 1], i = 1, . . . , 9 have the following effect on the model: 



density 


= 1 - 


h O.591 


bar area 


= 1 - 


1- O.592 


modulus of elasticity 


= 1 - 


1- O.593 


base width a 


= 1 - 


1- O.594 


base height b 


= 1 - 


1- O.595 


initial end displacement c 


= 1 - 


1- 0.59 6 


magnitude 71 


= 0.5(l + g7 ) 


magnitude 72 


= 0.5(l + 9 8 ) 


magnitude 73 


= 0.5(1 + 99) 



The equations of motion for this model are 

M (q)x + C (q)x + / int (x; 9) = p(t; 9). (33) 

Here, M (9) £ ~R NxN denotes the symmetric-positive-definite mass matrix, the internal forced is de- 
noted by /i n t : M. N x T> —> R N with (x, q) 1— > /; nt (x; q), and the symmetric-positive-semidefinite Rayleigh 
viscous damping matrix, denoted by C (9) € R NxN , is of the form 

C (9) = aM (9) + ^Vx/mt (x°; 9) . (34) 

Note that V x /i n t (a; ; 9) represents the tangent stiffness matrix at the initial condition. In this work, 
the Rayleigh coefficients a and /3 are determined by matching the first two damped frequencies with 
the first two undamped frequencies of the nominal structure (9^ = 0, i = 1, . . .6), while enforcing a 
damping ratio of C = sin -1 (2°) for the first two modes (38) Eq. (7)]. 
We consider an external force composed of three forces: 

3 

P(q,t) = 'Y^T i ri{q,t), (35) 

i=l 

piV 



where r. 4 e R and r % : R q x [0,T] ->• R for i = 1, . . . , 3. Figure |2(b)| depicts the forces' spatial 
distributions, which lead to vectors r^, i = 1,...,3 through the finite-element formulation. The 
parameterized, time-dependent magnitudes of these functions are: 

ri (q,t) = 7l (9) l R+ (t-T/2) sin (A! (t - T/2)) (36) 
r 2 (q, t) = 72 (q) 1 R+ (t - T/2) sin (A 2 (t - T/2)) (37) 
r S (g,*) = 73(g) l K+ (t- T/2) sin (A3 (t - T/2)), (38) 

where 7^, i = 1, . . . , 3 denotes the maximum force magnitudes, and 

1 ( ) I 1 ' T G ^ (39) 
1 0, otherwise 

is the indicator function. The frequencies are (arbitrarily) set to Ai = u;@, A2 = 5u; 6 , A3 = 0.buj e , where 
(jJq denotes the sixth largest natural frequency of the structure in its nominal configuration (9.; = 0, 
» = !,.. .6). 
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The equations of motion (33 1 can be rewritten in the standard form of Eqs. — as 

x = M (q)- 1 (p(t; q)-C (q)x - f int (x; q)) (40) 

x(0,p,q)=x°(q) (41) 

x(0,p,q) = v°(q). (42) 

The nonlinear function defining the second-order ODE is 

g (x, x; t,p, q) = M (g) -1 (p(t; q) - C (q)x - f int (x; q)) . (43) 



We employ an implicit Nystrom time integrator to compute the numerical solution to Eqs. (40) 



(42). In particular, we employ the implicit midpoint rule for both partitions. This leads to discrete 



equations (B.2) to be solved at each time step with explicit updates (B.3)-(B.4| and parameters s = 1, 
an = 1/2, an = 1/4, h\ =l,b\ = 1/2, c\ = 1/2. The unknowns are equivalent to the acceleration at 
the half time steps: w n — x (i™ -1 + l/2h\ n = 1, . . . , M. Multiplying the corresponding residual by 
M (q) yields 

R n (w n ) = M (q)w n + C (q) 



(44) 

To solve R n (w n ) — at each time step, Newton's method is applied. Each linearized system is solved 
directly using the Cholesky factorization (the Jacobian of the residual is symmetric) , and convergence 
of Newton's method is declared when the 2-norm of the residual is less than or equal to 10 _4 ||i?" (0) ||2- 

The time-interval length is set to T — 25. A time-step size of h — 0.6 is employed in the unforced 
case; it is set to h = 0.5 in the forced case. These values were determined by a convergence study 
on the nominal configuration defined by qi = 0, i = 1, . . . 6 and (ft = —1, i = 7, . . . , 9 (unforced) or 
qt = —1, i — 7, ... ,9 (forced). 

To construct the reduced-order models, we collect snapshots of the required quantities for (p, q) G 
-Dtrain and t G [0, T\. The trial basis $ is determined via POD. Snapshots of the state are collected 

X x = {x n - 1 +hx n - 1 + ^x n ' 1 -x(0;q) \n = l,...,M; (q,p) G Aram} (45) 

and the trial basis is set to $ = <I> e (X x , v x ) : where v x G [0, 1] is an 'energy criterion' and <J> e is defined 
by Algorithm [2] in |Appcndix C For Galerkin projection with least-squares residual reconstruction, 



the following snapshots are collected during the (full-order model) training simulations: 

X R - {R n (w n(k ^ | n = 1, . . . ,M; k = 0, . . . , K{n) - 1; (q,p) G Ami*}- (46) 

Here, K(n) denotes the number of Newton steps taken at time step n. The basis is set to = 
$ e (Xr, vr) with vr G [0, 1]. The same sampling matrix Z is used for the collocation and Gappy POD 
approximations; it is determined using the GNAT model-reduction method's approach for selecting 
the sample matrix [26, 4 . The number of rows in Z is set to be twice the number of columns in 

The output of interest is the downward displacement of the bottom-left point at the tip of the 
structure (denoted by d) for all time steps. To quantify the performance of the reduced-order models, 
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the following metrics are used: 



M 

M 

E \d n ~ "FOMl 
maxdg OM - mindg OM 

-f^FOM 
K = = 

K 

c TpQM 



(47) 

(48) 
(49) 



Here, e designates the scaled l\ norm of the discrepancy in the output predicted by a reduced-order 
model. The temporal-complexity-reduction factor is denoted by k, where K denotes the average 
number of Newton-like steps per time step. The speedup is denoted by S with T denoting the wall 
time required for a simulation. A subscript 'FOM' denotes a quantity computed using the full-order 
model. 

In all experiments, the forecasting method is compared with a 'no forecasting' case. For this 
case, the initial guess corresponds to a hrst-order approximation of the displacement within the time 
interval: x^(t) = x"- 1 + (t- t"" 1 ) i"^ 1 for t G [t n - 1 ,t n ] , or equivalently i™' 1 = w n ^ = 0. Also, 
the forecasting method always employs untruncated time-evolution bases: aj = ritrain for j = 1, . . . , N. 



4-. 2. Ideal case: invariant inputs, no truncation of bases 

This experiment explores the ideal case for the method: the online inputs equal the training inputs, 
and the bases are not truncated {y x — Vr = 1.0). In this scenario, the full-order model's temporal 
behavior encountered online is exactly the same as that observed during training simulations; for this 
reason, we expect the proposed method to perform very well. We consider a single configuration 
(strain = 1) characterized by qi = 0, i = 1, . . . , 9. For the forecasting technique, the Newton-step 
threshold is set to r — and the maximum memory is set to a max = 12. 

Table [T] and Figure [3] report the results. First, note that the relative errors generated by ROMs 
are essentially zero. This is expected, because the reduced bases are not truncated and the inputs are 
invariant. Next, note that the Galerkin ROM generates no speedup; this is expected because it is not 
equipped with a spatial-complexity-reduction technique (see Section 2.2.2 ). The other two techniques — 
which employ spatial-complexity-reduction approximations — lead to significant speedups. Also, it is 
evident that the reduced-order models exhibit no temporal-complexity reduction (i.e., k < 1.0) in the 
absence of the proposed forecasting technique. 

When the models employ the proposed forecasting technique, the number of Newton iterations dras- 
tically decreases, leading to temporal-complexity reductions of k = 49.5 for two ROMs and k — 6.25 
for the third. In turn, this leads to significantly improved wall-time speedups in all cases. This can 
be viewed as the best possible performance for the method (applied to this problem): the temporal 
behavior of the system is exactly predictable, as the inputs have not changed, and the reduced bases 
have not been truncated. So, the forecast is 'perfect' after only one time step for the Galerkin and 
Galerkin with Gappy POD ROMs; no Newton steps are required after this. The next sections investi- 
gate the forecasting method's performance in the (more realistic) case of varying inputs and truncated 
bases. 



4-3. Unforced dynamics, varying structure 

This experiment assesses the performance of the method when applied to the problem in the absence 
of external forcing, but subject to changes in the structure's shape, material properties, and initial 
condition. This will test the method for a problem exhibiting 'smooth' dynamics, as it amounts to 
a free vibration problem. We randomly choose six training configurations (n tra in = 6) for which we 
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ROM method 


relative 
error e 


No forecasting 


With forecasting 


Newton 
its KM 


speedup 
S 


reduction 
factor k 


Newton 
its KM 


speedup 
S 


reduction 
factor k 


Galerkin 


8.64 x KT 12 


99 


1.01 


1.0 


2 


1.84 


49.5 


Galerkin + Gappy POD 


8.64 x 1CT 12 


99 


36.4 


1.0 


2 


69.3 


49.5 


Galerkin + collocation 


2.12 x 1CT 5 


100 


36.5 


0.99 


16 


61.9 


6.25 



Table 1: Limiting case: forecast performance 




Figure 3: Limiting case: all ROMs generate near-zero error 



collect snapshot in the offline stage Aram = {p(q aam P le < k ) , gBample,k}6 =i; then ^ we deploy the ROMs on 
two randomly chosen online configurations q*' 1 and q*' 2 . Table [2] reports the values of the parametric 
inputs for these configurations. 



configuration 


91 


92 


93 


94 


95 


96 


^sample, 1 


-0.2999 


-0.9735 


0.2374 


0.5872 


0.9248 


0.4786 


^sample, 2 


0.0660 


0.3187 


-0.1555 


-0.8393 


0.6152 


0.3911 


^sample, 3 


0.2395 


-0.6085 


0.2981 


-0.8126 


0.6181 


-0.3415 


^sample, 4 


-0.0033 


0.3666 


-0.5749 


-0.4519 


0.2338 


0.0790 


^sample, 5 


-0.5650 


-0.9869 


0.5075 


0.2002 


0.8572 


0.8681 


^samplc,6 


0.9286 


-0.8243 


-0.4919 


-0.9116 


-0.7414 


0.3112 


q *,i 


-0.8722 


-0.3269 


-0.5777 


-0.4507 


0.3181 


-0.1177 


q*> 2 


0.0059 


-0.2022 


-0.6903 


0.7106 


0.7589 


0.8779 



Table 2: Unforced dynamics, varying structure: parametric inputs. Note that qi = —1 for i = 7,8,9, which sets the 
external-force magnitudes to zero. 

This experiment employs truncation criteria of v x = vr = 0.9999 for the reduced bases. Again, 
the forecasting technique employs t = and a max = 12. Figure [i] and Table [3] report the results for 
this experiment. First, note that all ROMs generate very small relative errors (e < 10~ 2 ) even though 
a mere 6 training points were (randomly) selected in a six-dimensional input parameter space. This 
strong performance can be attributed to the relative smooth dynamics characterizing the problem. 
Again, we observe that spatial-complexity reduction is necessary to generate significant speedups (the 
Galerkin ROM has a speedup of roughly one). We also note that — in the absence of the proposed 
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forecasting technique — the ROMs exhibit no temporal-complexity reduction (n = 1.0). 

When the models employ the proposed forecasting technique, the number of total Newton steps is 
nearly cut in half, as all methods generate temporal-complexity reduction factors of k = 1.71. This 
also leads to a significant improvement in speedup for all reduced-order models. These results indicate 
that the proposed technique is effective in realistic scenarios where the reduced bases are truncated 
and the inputs vary. 

To gain insight into the method's potential, Figure [5] depicts the time evolution of the first gener- 
alized unknown wi for the online and training inputs; note that this is one of the forecasted variables. 
The online inputs lead to different frequency content of the generalized unknown's temporal behavior, 
even though the qualitative behavior of the response is similar. The forecasting method performs well 
in spite of this frequency shift. This indicates that the method is reasonably robust with respect to 
frequency shifts in the system's temporal response. 



Online 
inputs 


ROM method 


relative 
error e 


No forecasting 


With forecasting 


Newton 
its 


speedup 
S 


reduction 
factor k 


Newton 
its 


speedup 
S 


reduction 
factor k, 


q *,l 


Galerkin 
Gal. + Gappy 
Gal. + coll. 


2.04 x 10" 
1.77 x 10- 
1.83 x 10~ 


3 
3 
3 


82 
82 
82 


0.998 
58.2 
59.4 


1.00 
1.00 
1.00 


48 
48 
48 


1.37 

82.4 
80.8 


1.71 
1.71 
1.71 




Galerkin 


3.86 x 10- 


3 


82 


1.03 


1.00 


48 


1.55 


1.71 




Gal. + Gappy 


3.64 x 10- 


3 


82 


54 


1.00 


48 


86.1 


1.71 




Gal. + coll. 


3.19 x 10- 


3 


82 


55.5 


1.00 


48 


88.7 


1.71 



Tabic 3: Unforced dynamics: forecast performance 




Figure 4: Unforced dynamics: responses generated by reduced-order models 



4-. 4- Forced dynamics, fixed structure 

This experiment analyzes the method for the problem in the absence of shape and material-property 
changes, but with external forcing and changes in the initial condition. Because the external forces 
are of relatively high frequency (close to the largest natural frequency of the structure) , we expect this 
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Figure 5: Unforced dynamics. First generalized unknown at online inputs (bold curve) and training inputs (thin curves) 



problem to be characterized by 'rich' dynamics: the first half of the time interval is a free-vibration 
problem, while the second half of the time interval is a high-frequency forced response. As before, 
we randomly choose six training configurations and two online configurations; Table [4] reports the 
associated parametric inputs. 



configuration 


96 


97 


98 


99 


^sample, 1 


0.4786 


0.0001 


0.4019 


-0.9561 


^samplc,2 


0.3911 


-0.2747 


0.7756 


0.4022 


^samplc,3 


-0.3415 


-0.4041 


0.7287 


0.4164 


^sample, 4 


0.0790 


-0.2275 


-0.4506 


0.4536 


^sample, 5 


0.8681 


0.9115 


0.6784 


0.2574 


^sample, 6 


0.3112 


0.9821 


0.8428 


-0.1354 


q*A 


-0.1177 


0.5200 


-0.4993 


-0.2177 


q *,2 


0.8779 


0.1817 


-0.2161 


-0.3058 



Table 4: Forced dynamics, fixed structure: parametric inputs. Note that qi = for i = 1, . . . , 5, which causes shape and 
material-property inputs to be fixed. 



Again, we employ truncation criteria of v x = Vr = 0.9999 and forecasting parameters r = and 
ctmax = 12. Figure [6] and Table [5] report the results for this experiment. 

Similar results to the experiment in Section 4.3 can be observed. All relative errors are reasonably 
small (below 2 x 10 -2 ), although the responses deviate from the full-order model around t — 20. Also, 
the proposed forecasting technique significantly improves the temporal-complexity reduction factor k 
and leads to improvements in speedups. However, the performance of the proposed technique is not 
quite as strong as in the unforced case (compare Tables [3] and [5]) . This is likely attributable to the 
presence of richer dynamics in the present experiment. Figure [7] depicts this: the temporal behavior of 
the first generalized coordinate is much less smooth than in the unforced case. However, because the 
temporal behavior remains qualitatively similar when the inputs are varied, the forecasting method can 
effectively exploit the training data to generate an accurate forecast. This highlights one advantage of 
the proposed method: it is relatively insensitive to the smoothness of the temporal response. Instead, 
it depends much more strongly on how this response is affected by input variation. 
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No forecasting 


With forecasting 


Online 
inputs 


ROM method 


relative 
error e 




Newton 
its 


speedup 
S 


reduction 
factor k 


Newton 
its 


speedup 
S 


reduction 
factor k 




Galerkin 


1.51 x 10" 


2 


100 


1.02 


1.00 


60 


1.47 


1.67 




Gal. + Gappy 


1.39 x 10" 


2 


100 


52.7 


1.00 


61 


75.2 


1.64 




Gal. + coll. 


1.38 x 10" 


2 


100 


49.5 


1.00 


71 


70.9 


1.41 




Galerkin 


1.50 x 10" 


2 


100 


1.02 


1.00 


60 


1.52 


1.67 


q *,2 


Gal. + Gappy 


1.40 x 10" 


2 


100 


52.1 


1.00 


61 


73.2 


1.64 




Gal. + coll. 


1.41 x 10" 


2 


100 


54.3 


1.00 


68 


71.8 


1.47 



Table 5: Forced dynamics, fixed structure: forecast performance 



4-. 5. Forced dynamics, varying structure 

This experiment pushes the boundaries of the method further. Here, we consider both external 
forces and variations in the structure's shape and material properties. As a result, we expect rich 
dynamics and a larger variation in the system's dynamics in the input space. Table [6] reports the 
randomly-chosen parametric inputs for this experiment. 



configuration 


91 


92 


93 


94 


95 


96 


97 


98 


99 


^sample, 1 


-0.2999 


-0.9735 


0.2374 


0.5872 


0.9248 


0.4786 


0.0001 


0.4019 


-0.9561 


^samplc,2 


0.0660 


0.3187 


-0.1555 


-0.8393 


0.6152 


0.3911 


-0.2747 


0.7756 


0.4022 


^sample, 3 


0.2395 


-0.6085 


0.2981 


-0.8126 


0.6181 


-0.3415 


-0.4041 


0.7287 


0.4164 


^sample, 4 


-0.0033 


0.3666 


-0.5749 


-0.4519 


0.2338 


0.0790 


-0.2275 


-0.4506 


0.4536 


^sample, 5 


-0.5650 


-0.9869 


0.5075 


0.2002 


0.8572 


0.8681 


0.9115 


0.6784 


0.2574 


^sample, 6 


0.9286 


-0.8243 


-0.4919 


-0.9116 


-0.7414 


0.3112 


0.9821 


0.8428 


-0.1354 


q*' 1 


-0.8722 


-0.3269 


-0.5777 


-0.4507 


0.3181 


-0.1177 


0.5200 


-0.4993 


-0.2177 


q *,2 


0.0059 


-0.2022 


-0.6903 


0.7106 


0.7589 


0.8779 


0.1817 


-0.2161 


-0.3058 



Table 6: Forced dynamics, varying structure: parametric inputs 



As before, we employ the following parameters: v x = vr — 0.9999, r = 0, and a max = 12. Figure 
[8] and Table [7] report the results for this experiment. 

For these experiments, we observe that the relative errors generated by the reduced-order models 
are significantly larger than in the previous experiments. This illustrates that the inputs are inducing 
a wider variety of dynamics, which reduced-order models have difficulty capturing. For this reason, 
we expect the proposed forecasting method to also perform worse, as the temporal behavior is (likely) 
also more difficult to predict. This is certainly the case: the temporal-complexity-reduction factors k 
are lower than in previous experiments. Nonetheless, the proposed technique results in significantly 
improved performance in all but one case. The only exception is for the Galerkin ROM applied to 9*' 1 ; 
here, the forecasting technique decreases k from 1.39 (without forecasting) to 1.33. This illustrates 
that the forecasting method can lead to degraded performance if the forecast is inaccurate; this can 
occur if the temporal behavior of the response is sufficiently rich and varies significantly with parameter 
variation. 

Thus, as expected, the method appears to perform best when the dynamics of the problem are 
relatively smooth and the temporal behavior does not drastically change as inputs vary. 

4.6. Parameter study: maximum memory a max and Newton-step threshold r 

In this experiment, we perform a parameter study to determine the parameters a max and r that 
lead to the best performance for the method. For this purpose, we run the experiments described 
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Figure 8: Forced dynamics, varying structure: responses generated by reduced-order models 




Figure 9: Forced dynamics, varying structure. First generalized unknown at online inputs (bold curve) and training 
inputs (thin curves) 
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No forecasting 


With forecasting 


Online 
inputs 


ROM method 


relative 
error e 




Newton 
its 


speedup 
S 


reduction 
factor k 


Newton 
its 


speedup 
S 


reduction 
factor k 




Galerkin 


4.95 x 10" 


a 


104 


1.21 


1.39 


109 


1.38 


1.33 




Gal. + Gappy 


1.65 x 10" 


l 


124 


8 


1.17 


94 


9.48 


1.54 




Gal. + coll. 


6.93 x 10- 


2 


120 


8.12 


1.21 


90 


11 


1.61 




Galerkin 


1.40 x 10" 


i 


95 


1.04 


1.02 


62 


1.47 


1.56 


q *,2 


Gal. + Gappy 


1.17 x 10" 


1 


95 


7.47 


1.02 


64 


10.4 


1.52 




Gal. + coll. 


1.21 x 10" 


1 


100 


7.36 


0.97 


73 


9.98 


1.33 



Table 7: Forced dynamics, varying structure: forecast performance 



in Sections |4.3||4.5| for a max = 6,9,12,15 and t = 0, 1. Then, we compute I as the average value 
of ^forecast /«no over all experiments and all three reduced-order models. Here, Kf orGcast designates 
the temporal-complexity-reduction factor achieved when the proposed forecasting method is used; K no 
denotes the temporal-complexity-reduction factor obtained without forecasting. Similarly, s denotes 
the average value of Si OICC ast/S no over all experiments and reduced-order models. Here, ^forecast denotes 
the wall-time speedup obtained when the proposed method is used; S no is the speedup when the method 
is not employed. 

Figure [10] reports the results for the parameter study. It is evident that the parameters that lead to 
best performance for this problem in both temporal-complexity-reduction improvement (t) and speedup 
improvement (s) are r = and a max = 12. Interestingly, a max = 6 yields the worst performance in 
temporal-complexity reduction. This choice corresponds to interpolation in the Gappy POD forecast, 
as the forecast employs six basis vectors. This observation is consistent with those reported in Refs. 
[2"6l 3] : interpolation when applied with Gappy POD rarely leads to the best performance in reduced- 
order modeling. 
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Figure 10: Parameter study: performance averaged over all experiments and reduced-order models 



4-. 7. Parameter study: dimension of trial subspace N 

This experiment analyzes the performance of the proposed t echn ique as a function of trial-subspace 
dimension N. We consider the experimental setup in Section 4.3 and run the experiment for v x = 
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v R = 0.9,0.99,0.999,0.9999,0.99999,0.999999. The temporal-complexity method employs r = and 



Figure 11 reports the results. These results indicate that as the trial-subspace dimension increases, 
the error in the reduced-order models' responses decreases monotonically and the performance of the 
temporal-complexity-reduction method improves, plateauing at k ~ 1.7. This can be explained as fol- 
lows: when the trial-subspace dimension increases, the ROMs become more accurate, so they generate 
responses closer to that of the full-order model. Because the time-evolution bases are generated from 
full-ordcr-modcl training simulations, a more accurate ROM implies a more accurate forecast. In turn, 
an accurate forecast leads to a greater reduction in temporal complexity. 
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Figure 11: Unforced dynamics, varying structure. Effect of trial-basis dimension TV on error and temporal-complexity- 
reduction method performance. 



5. Conclusions 

This paper has described a method for decreasing the temporal complexity of nonlinear reduced- 
order models in the case of implicit time integration. The method exploits knowledge of the dynamical 
system's temporal behavior in the form of 'time-evolution bases'; one such basis is generated for each 
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generalized coordinate of the time integrator's unknown during the (offline) training stage. During the 
(online) deployed stage, these time-evolution bases are used — along with the solution at recent time 
steps — to forecast the unknown at future time steps via Gappy POD. If this forecast is accurate, the 
Newton-like solver will converge in very few iterations, leading to computational-cost savings. 

Numerical experiments demonstrated the potential of the method to significantly improve the 
performance of nonlinear reduced-order models, even in the presence of high-frequency content in 
the dynamics. The experiments also demonstrated the effect of input parameters and trial-subspace 
dimension on the method's performance, and provided a parameter study to analyze the effect of the 
method's parameters. 

Future work includes devising a way to directly handle frequency and phase shifts in the response, 
as well as time-shifted temporal behavior. 
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Appendix A. Implicit time-integration schemes: first-order ODEs 

For notational simplicity, consider a system without parametric inputs q, and define f(x,t) = 
f (x; t,p(t)) such that 

x = f(x,t). (A.l) 
Further, denote by h the time-step size at time step n. 

Appendix A.l. Implicit linear multi-step schemes 

A linear fc-step method applied to first-order ODEs can be expressed as 

k k 

Y, ajX n ~ j = h ]T Pj (x n -i,t n ~i) , (A.2) 

3=0 j=0 

k 

where a ^ and Y] oij = is necessary for consistency. These methods are implicit if /?o 7^ 0. In 

3=0 

this case, the form of the residual is 

k k 

FT (w n ) = a w n - h/3 f{w n ,t n ) + ^a^"^' - h ^ pj {x n ~ j , i""') (A.3) 

3=1 3=1 

and the explicit state update is simply 

x n = w n . (A.4) 
Therefore, the unknown is the state at time t n . 
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Appendix A. 2. Implicit Runge-Kutta schemes 

For an s-stage Runge-Kutta scheme, the form of the residual is 

s 

K? (w n '\ . . . , w n > s ) = w n - 1 - ]{x n - x + h^2 a l3 w n -\ t"" 1 + ah), i = l,...,s (A.5) 
with the following explicit computation of the state: 

s 

x n = x 71 - 1 + h^2b lW n >\ (A.6) 
i=i 

The unknowns correspond to the velocity x at times t n ~ x + ah, i = 1, . . . , s. 

Appendix B. Implicit time-integration schemes: second-order ODEs 

For notational simplicity, consider a second-order differential equations without parametric inputs 
q and define g (x, x,t) = g (x, x; t,p(t)) such that 

x = g(x,x,t). (B.l) 

Appendix B.l. Implicit Nystrom method 

Nystrom methods are partitioned Runge-Kutta schemes applied to second-order ODEs. They lead 
to the following representation for the residuals: 

(s s 
x™- 1 +c l hx n ~ 1 +h 2 ^a l3 w n -\x n - 1 + hJ2a lJ w n <\t n - 1 + c,h 
3 = 1 3 = 1 

'(B.2) 

The state and velocity are updated explicitly as 

s 

x n = x 11 - 1 + hx n - x +h 2 ^ hw n ' 1 (B.3) 

i=l 

s 

x n = x 11 - 1 + h^b lW n > 1 . (B.4) 
i=i 

The unknowns correspond to the acceleration x at times t 71 ^ 1 + Cj/i, i = 1, . . . , s. 

Appendix B.2. Implicit Newmark method 

The implicit Newmark method leads to the following residuals: 

R n (w n ) =w n -g (x™- 1 + hx n - 1 + y [(1 - 2/3) x^ 1 + 20w n ] , i™" 1 + h [(1 - 7) x™" 1 + jw n ] , t n ^j 

(B.5) 

The state and velocity are explicitly updated as 

h 2 

x n = x 11 - 1 + hi"- 1 + — [(1 - 2/3) x n - x + 2/3w n ] (B.6) 
x n ^i"- 1 +h[{l- 1 )x n - 1 +710"]. (B.7) 
Here, the unknown corresponds to the acceleration £ at time t". 



24 



Appendix C. Proper orthogonal decomposition 

Algorithm [2] describes the method for computing a proper-orthogonal-decomposition (POD) basis 
given a set of snapshots. The method essentially amounts to computing the singular value decompo- 
sition of the snapshot matrix. The left singular vectors define the POD basis. 



Algorithm 2 Proper-orthogonal-decomposition basis computation (normalized snapshots) 

Input: Set of snapshots X = {ttfj}^ C K. , energy criterion v £ [0, 1] 
Output: $ e (A» 

l: Compute thin singular value decomposition W = UY,V T , where W = [wi/\\wi\\ ■ ■ ■ w nul /\\w nw \\]. 
2: Choose dimension of truncated basis N = n e {v) 1 where 

n e (v) = arg min i (C.l) 

n n w 

V(y) = {n e {1, . . . , n w } | £ a}/ £ > „}, (C.2) 

i=l i=l 

and S = diag (<Tj). 
3: $ e (A", z/) = [ui ■ ■ ■ Uft] , where U =[u\ ■ ■ ■ u n J\. 
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